232 research outputs found

    Probabilistic Multilevel Clustering via Composite Transportation Distance

    We propose a novel probabilistic approach to multilevel clustering problems based on composite transportation distance, a variant of transportation distance in which the underlying metric is the Kullback-Leibler divergence. Our method involves solving a joint optimization problem over spaces of probability measures to simultaneously discover grouping structures within groups and among groups. By exploiting the connection of our method to the problem of finding composite transportation barycenters, we develop fast and efficient optimization algorithms even for potentially large-scale multilevel datasets. Finally, we present experimental results with both synthetic and real data to demonstrate the efficiency and scalability of the proposed approach. Comment: 25 pages, 3 figures
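    A minimal sketch of the core computation, assuming one-dimensional Gaussian atoms and the POT library's exact solver (ot.emd); the mixtures and weights below are toy values, not the paper's setup:

```python
# Illustrative sketch, not the authors' code: optimal transport between two
# mixtures of 1-D Gaussians where the ground cost is the KL divergence
# between atoms, as in a composite transportation distance.
import numpy as np
import ot  # Python Optimal Transport: pip install pot

def kl_gaussians(mu1, var1, mu2, var2):
    """Closed-form KL(N(mu1, var1) || N(mu2, var2)) for 1-D Gaussians."""
    return 0.5 * (np.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0)

def composite_transport_distance(atoms1, w1, atoms2, w2):
    """Transportation distance whose underlying metric is KL divergence."""
    cost = np.array([[kl_gaussians(m1, v1, m2, v2) for (m2, v2) in atoms2]
                     for (m1, v1) in atoms1])
    plan = ot.emd(w1, w2, cost)  # exact linear-program OT plan
    return np.sum(plan * cost)

# Toy group-level mixtures: lists of (mean, variance) atoms with weights.
g1, w1 = [(0.0, 1.0), (3.0, 0.5)], np.array([0.5, 0.5])
g2, w2 = [(0.5, 1.2), (4.0, 0.4), (8.0, 1.0)], np.array([0.4, 0.4, 0.2])
print(composite_transport_distance(g1, w1, g2, w2))
```

    The paper's joint optimization alternates transport problems of this kind at the local and global levels; the snippet only shows the distance itself.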

    Revisiting Sliced Wasserstein on Images: From Vectorization to Convolution

    The conventional sliced Wasserstein is defined between two probability measures whose realizations are vectors. When comparing two probability measures over images, practitioners first vectorize the images and then project them to one-dimensional space via matrix multiplication between the sample matrix and the projection matrix. The sliced Wasserstein is then evaluated by averaging the Wasserstein distances between the corresponding pairs of one-dimensional projected probability measures. However, this approach has two limitations. First, the vectorization step does not capture the spatial structure of images efficiently, which makes it harder for the subsequent slicing process to gather discrepancy information. Second, it is memory-inefficient, since each slicing direction is a vector with the same dimension as the images. To address these limitations, we propose novel slicing methods for sliced Wasserstein between probability measures over images based on convolution operators. We derive convolution sliced Wasserstein (CSW) and its variants by incorporating stride, dilation, and non-linear activation functions into the convolution operators. We investigate the metricity of CSW as well as its sample complexity, its computational complexity, and its connection to conventional sliced Wasserstein distances. Finally, we demonstrate the favorable performance of CSW over the conventional sliced Wasserstein in comparing probability measures over images and in training deep generative models on images. Comment: 34 pages, 12 figures, 10 tables
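    A minimal sketch of one convolution slice, assuming PyTorch and 8×8 single-channel images; the kernel sizes, strides, and Monte Carlo loop are illustrative choices, not the paper's exact architecture:

```python
# Illustrative sketch: each "slice" is a chain of random strided convolutions
# that maps an image to a scalar; the 1-D Wasserstein distance between the
# projected batches is then averaged over random slices.
import torch
import torch.nn.functional as F

def random_kernels():
    """Draw one pair of unit-norm random kernels, shared by both batches."""
    k1 = torch.randn(1, 1, 4, 4)
    k2 = torch.randn(1, 1, 2, 2)
    return k1 / k1.norm(), k2 / k2.norm()

def conv_slice(x, k1, k2):
    """Map (N, 1, 8, 8) images to one scalar each via strided convolutions."""
    x = F.conv2d(x, k1, stride=4)  # (N, 1, 2, 2)
    x = F.conv2d(x, k2, stride=2)  # (N, 1, 1, 1)
    return x.flatten()

def w1_1d(u, v):
    """1-D Wasserstein-1 between empirical measures with equal sample counts."""
    return (torch.sort(u).values - torch.sort(v).values).abs().mean()

x = torch.randn(128, 1, 8, 8)       # two toy image batches
y = torch.randn(128, 1, 8, 8) + 0.5
dists = []
for _ in range(50):                 # Monte Carlo average over random slices
    k1, k2 = random_kernels()
    dists.append(w1_1d(conv_slice(x, k1, k2), conv_slice(y, k1, k2)))
print(torch.stack(dists).mean())
```

    Note the memory point from the abstract: each slice here stores a 4×4 and a 2×2 kernel rather than a full 64-dimensional projection vector.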

    Energy-Based Sliced Wasserstein Distance

    The sliced Wasserstein (SW) distance has been widely recognized as a statistically effective and computationally efficient metric between two probability measures. A key component of the SW distance is the slicing distribution. There are two existing approaches for choosing this distribution. The first uses a fixed prior distribution. The second optimizes for the best distribution within a parametric family, chosen to maximize the expected distance. However, both approaches have limitations. A fixed prior distribution is non-informative in terms of highlighting projecting directions that can discriminate between two general probability measures. Optimizing for the best distribution is often expensive and unstable, and the parametric family of candidate distributions can easily be misspecified. To address these issues, we propose to design the slicing distribution as an energy-based distribution that is parameter-free and has density proportional to an energy function of the projected one-dimensional Wasserstein distance. We then derive a novel sliced Wasserstein metric, the energy-based sliced Wasserstein (EBSW) distance, and investigate its topological, statistical, and computational properties via importance sampling, sampling importance resampling, and Markov Chain methods. Finally, we conduct experiments on point-cloud gradient flow, color transfer, and point-cloud reconstruction to show the favorable performance of the EBSW. Comment: 36 pages, 7 figures, 6 tables
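    A minimal sketch of the importance-sampling route, assuming the energy function exp(W) and a uniform proposal on the sphere; the tensor shapes and names are illustrative, not the paper's reference implementation:

```python
# Illustrative sketch: estimate an energy-based sliced Wasserstein value by
# drawing directions uniformly on the sphere (proposal) and reweighting them
# by a slicing density proportional to exp(W(theta)).
import torch

def w1_per_slice(xp, yp):
    """1-D Wasserstein-1 per slice (column), assuming equal sample counts."""
    return (torch.sort(xp, dim=0).values
            - torch.sort(yp, dim=0).values).abs().mean(dim=0)

N, d, L = 256, 10, 100
x = torch.randn(N, d)               # two toy point clouds
y = torch.randn(N, d) + 1.0
thetas = torch.randn(d, L)
thetas = thetas / thetas.norm(dim=0, keepdim=True)  # uniform proposal on sphere
w = w1_per_slice(x @ thetas, y @ thetas)            # W(theta) per direction
weights = torch.softmax(w, dim=0)   # self-normalized weights for density ∝ exp(W)
print((weights * w).sum())          # importance-sampling estimate
```

    The softmax gives the self-normalized importance weights exp(W)/Σ exp(W), so no normalizing constant of the energy-based slicing distribution is ever needed.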

    Amortized Projection Optimization for Sliced Wasserstein Generative Models

    Seeking informative projecting directions has been an important task in applying the sliced Wasserstein distance. However, finding these directions usually requires an iterative optimization procedure over the space of projecting directions, which is computationally expensive. The computational issue is even more severe in deep learning applications, where the distance between two mini-batch probability measures is computed repeatedly. This nested loop has been one of the main challenges preventing the use of sliced Wasserstein distances based on good projections in practice. To address this challenge, we propose to use the learning-to-optimize technique, or amortized optimization, to predict the informative direction for any given pair of mini-batch probability measures. To the best of our knowledge, this is the first work that bridges amortized optimization and sliced Wasserstein generative models. In particular, we derive linear amortized models, generalized linear amortized models, and non-linear amortized models, which correspond to three types of novel mini-batch losses, named amortized sliced Wasserstein. We demonstrate the favorable performance of the proposed sliced losses in deep generative modeling on standard benchmark datasets. Comment: Accepted to NeurIPS 2022, 22 pages, 6 figures, 8 tables
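    A minimal sketch of the amortization idea, assuming a hypothetical batch-mean parameterization (the paper's linear amortized models are defined differently): a small network predicts the projection direction in one forward pass, replacing the inner per-pair optimization loop, and is trained to maximize the projected one-dimensional Wasserstein distance.

```python
# Illustrative sketch, not the paper's code: an amortized model mapping a
# pair of mini-batches to one unit projection direction.
import torch
import torch.nn as nn

class LinearAmortized(nn.Module):
    """Predicts a unit direction from the two batch means
    (a hypothetical parameterization chosen for brevity)."""
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(2 * dim, dim)

    def forward(self, x, y):
        theta = self.linear(torch.cat([x.mean(0), y.mean(0)]))
        return theta / theta.norm()          # stay on the unit sphere

def projected_w1(x, y, theta):
    """1-D Wasserstein-1 between the two batches projected onto theta."""
    return (torch.sort(x @ theta).values
            - torch.sort(y @ theta).values).abs().mean()

dim = 10
amortizer = LinearAmortized(dim)
opt = torch.optim.Adam(amortizer.parameters(), lr=1e-3)
for _ in range(100):  # train the amortizer to output discriminative directions
    x, y = torch.randn(64, dim), torch.randn(64, dim) + 1.0
    loss = -projected_w1(x, y, amortizer(x, y))  # ascend the projected distance
    opt.zero_grad(); loss.backward(); opt.step()
```

    In a generative-modeling setting, the amortizer would be trained adversarially against the generator, which minimizes the same projected distance.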